Search results for "random projection"

showing 8 items of 8 documents

Do Randomized Algorithms Improve the Efficiency of Minimal Learning Machine?

2020

Minimal Learning Machine (MLM) is a recently popularized supervised learning method, which is composed of distance-regression and multilateration steps. The computational complexity of MLM is dominated by the solution of an ordinary least-squares problem. Several different solvers can be applied to the resulting linear problem. In this paper, a thorough comparison of possible and recently proposed, especially randomized, algorithms is carried out for this problem with a representative set of regression datasets. In addition, we compare MLM with shallow and deep feedforward neural network models and study the effects of the number of observations and the number of features with a special dat…

0209 industrial biotechnology; random projection; lcsh:Computer engineering. Computer hardware; Computational complexity theory; Computer science; Random projection; lcsh:TK7885-7895; 02 engineering and technology; Machine learning; computer.software_genre; supervised learning; approximate algorithms; Set (abstract data type); regressioanalyysi; 020901 industrial engineering & automation; distance–based regression; algoritmit; 0202 electrical engineering electronic engineering information engineering; ordinary least–squares; business.industry; Supervised learning; singular value decomposition; minimal learning machine; Multilateration; projektio; Randomized algorithm; koneoppiminen; machine learning; Scalability; Feedforward neural network; 020201 artificial intelligence & image processing; Artificial intelligence; approksimointi; business; computer; Machine Learning and Knowledge Extraction
researchProduct

Automatic Image Annotation Using Random Projection in a Conceptual Space Induced from Data

2018

The main drawback of a detailed representation of visual content, whatever its origin, is that the significant features are very high dimensional. To keep the problem tractable while preserving the semantic content, a dimensionality reduction of the data is needed. We propose Random Projection techniques to reduce the dimensionality. Even though this technique is sub-optimal with respect to Singular Value Decomposition, its much lower computational cost makes it more suitable for this problem, in particular when computational resources are limited, as in mobile terminals. In this paper we present the use of a "conceptual" space, automatically induced from data, to perform automa…

Computer science; business.industry; Dimensionality reduction; Random projection; Feature extraction; RANDOM MAPPING; Pattern recognition; 02 engineering and technology; 010501 environmental sciences; Conceptual-space; 01 natural sciences; Visualization; Automatic image annotation; Random-projection; Histogram; Singular value decomposition; 0202 electrical engineering electronic engineering information engineering; Image-semantic; 020201 artificial intelligence & image processing; Artificial intelligence; IMAGE ANNOTATION; business; CONCEPTUAL SPACE; 0105 earth and related environmental sciences; Curse of dimensionality
researchProduct
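
The dimensionality reduction described in this abstract can be illustrated with a minimal sketch. It assumes a dense Gaussian projection matrix with entries drawn from N(0, 1/k), which is one common construction and not necessarily the one used in the paper; the dimensions, the seed, and the function names are hypothetical. The sketch projects two vectors from 500 to 200 dimensions and prints how well their Euclidean distance is preserved.

```python
import math
import random

def random_projection_matrix(d, k, rng):
    """Return a d x k matrix with i.i.d. N(0, 1/k) entries (assumed construction)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]

def project(x, R):
    """Project a length-d vector x onto k dimensions: y = x R."""
    k = len(R[0])
    return [sum(x[i] * R[i][j] for i in range(len(x))) for j in range(k)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

rng = random.Random(0)          # fixed seed so the run is repeatable
d, k = 500, 200                 # hypothetical original and reduced dimensions
R = random_projection_matrix(d, k, rng)

# Two synthetic high-dimensional feature vectors (illustrative data only).
u = [rng.uniform(-1.0, 1.0) for _ in range(d)]
v = [rng.uniform(-1.0, 1.0) for _ in range(d)]

# With the 1/sqrt(k) scaling, projected distances are close to the originals.
ratio = euclidean(project(u, R), project(v, R)) / euclidean(u, v)
print(f"projected / original distance: {ratio:.3f}")
```

Unlike a singular value decomposition, the projection matrix is generated without looking at the data, which is where the lower computational cost comes from.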

The impact of feature extraction on the performance of a classifier : kNN, Naïve Bayes and C4.5

2005

"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic rise in computational complexity and classification error in high dimensions. In this paper, different feature extraction techniques, as means of (1) dimensionality reduction and (2) constructive induction, are analyzed with respect to the performance of a classifier. Three commonly used classifiers are taken for the analysis: kNN, Naïve Bayes and the C4.5 decision tree. One of the main goals of this paper is to show the importance of the use of class information in feature extraction for classification and (in)appropriateness of random projection or conventional PCA to feature extraction for …

Covariance matrix; Computer science; business.industry; Random projection; Dimensionality reduction; Feature extraction; Linear classifier; Pattern recognition; Machine learning; computer.software_genre; Naive Bayes classifier; ComputingMethodologies_PATTERNRECOGNITION; Principal component analysis; Artificial intelligence; business; computer; Curse of dimensionality; Advances in artificial intelligence : 18th conference of the canadian society for computational Studies of Intelligence, Canadian AI 2005, Victoria, Canada, May 9-11, 2005 : proceedings
researchProduct

Using Topic Modeling Methods for Short-Text Data: A Comparative Analysis

2020

With the growth of online social network platforms and applications, large amounts of textual user-generated content are created daily in the form of comments, reviews, and short-text messages. As a result, users often find it challenging to discover useful information or more on the topic being discussed from such content. Machine learning and natural language processing algorithms are used to analyze the massive amount of textual social media data available online, including topic modeling techniques that have gained popularity in recent years. This paper investigates the topic modeling subject and its common application areas, methods, and tools. Also, we examine and compare five frequen…

Topic model; short text; Information retrieval; Social network; business.industry; Latent semantic analysis; Computer science; Random projection; topic modeling; User-generated content; Subject (documents); Context (language use); Latent Dirichlet allocation; lcsh:QA75.5-76.95; symbols.namesake; Artificial Intelligence; online social networks; symbols; Methods; lcsh:Electronic computers. Computer science; natural language processing; business; user-generated content; Frontiers in Artificial Intelligence
researchProduct

Improving Scalable K-Means++

2021

Two new initialization methods for K-means clustering are proposed. Both proposals are based on applying a divide-and-conquer approach to the K-means‖ type of initialization strategy. The second proposal also uses multiple lower-dimensional subspaces produced by the random projection method for the initialization. The proposed methods are scalable and can be run in parallel, which makes them suitable for initializing large-scale problems. In the experiments, comparison of the proposed methods to the K-means++ and K-means‖ methods is conducted using an extensive set of reference and synthetic large-scale datasets. Concerning the latter, a novel high-dimensional clustering data generation …

random projection; lcsh:T55.4-60.8; K-means++; algoritmit; clustering initialization; algoritmiikka; lcsh:Industrial engineering. Management engineering; klusterianalyysi; lcsh:Electronic computers. Computer science; tiedonlouhinta; K-means‖; lcsh:QA75.5-76.95
researchProduct
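
As background for this entry, the following is a minimal sketch of the standard K-means++ seeding rule (D² sampling) applied to data that has first been reduced with a single Gaussian random projection. It is not the divide-and-conquer or multi-subspace method the paper proposes; the synthetic data, dimensions, seed, and function names are all assumptions made for illustration.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def project(X, R):
    """Multiply each row of X (n x d) by the projection matrix R (d x k)."""
    k = len(R[0])
    return [[sum(x[i] * R[i][j] for i in range(len(x))) for j in range(k)] for x in X]

def kmeanspp_seed_indices(points, n_seeds, rng):
    """Return indices of n_seeds initial centers chosen by D^2 sampling."""
    chosen = [rng.randrange(len(points))]
    d2 = [sq_dist(p, points[chosen[0]]) for p in points]
    while len(chosen) < n_seeds:
        # Sample the next center with probability proportional to the squared
        # distance to the nearest center chosen so far.
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        chosen.append(idx)
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]
    return chosen

rng = random.Random(0)
d, k_dim, n_clusters = 50, 5, 3   # hypothetical sizes

# Synthetic data: three Gaussian blobs with different offsets.
data = [[rng.gauss(10.0 * c, 1.0) for _ in range(d)]
        for c in range(n_clusters) for _ in range(40)]

# Gaussian random projection matrix with N(0, 1/k_dim) entries (assumed).
R = [[rng.gauss(0.0, 1.0) / math.sqrt(k_dim) for _ in range(k_dim)] for _ in range(d)]

# Seed in the low-dimensional space, then map the indices back to original rows.
seed_idx = kmeanspp_seed_indices(project(data, R), n_clusters, rng)
initial_centers = [data[i] for i in seed_idx]
print("seed row indices:", seed_idx)
```

Seeding in the projected space makes each distance computation cheaper (5 coordinates instead of 50 here), while the returned centers are still rows of the original data.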

Improvements and applications of the elements of prototype-based clustering

2018

Clustering or cluster analysis is an essential part of data mining, machine learning, and pattern recognition. The most popularly applied clustering methods are partitioning-based or prototype-based methods. Prototype-based clustering methods usually have easy implementability and good scalability. These methods, such as K-means clustering, have been used for different applications in various fields. On the other hand, prototype-based clustering methods are typically sensitive to initialization, and the selection of the number of clusters for knowledge discovery purposes is not straightforward. In the era of big data, in high-velocity, ever-growing datasets, which can also be erroneous, outl…

random projection; parallel computing; knowledge discovery; clustering initialization; minimal learning machine; data mining; prototype-based clustering; machine learning; koneoppiminen; big data; rinnakkaiskäsittely; klusterianalyysi; tiedonlouhinta; robust clustering; K-means
researchProduct

Online anomaly detection using dimensionality reduction techniques for HTTP log analysis

2015

Modern web services face an increasing number of new threats. Logs are collected from almost all web servers, and for this reason analyzing them is beneficial when trying to prevent intrusions. Intrusive behavior often differs from normal web traffic. This paper proposes a framework to find abnormal behavior in these logs. We compare random projection, principal component analysis and diffusion map for anomaly detection. In addition, the framework has online capabilities. The first two methods have intuitive extensions while diffusion map uses the Nyström extension. This fast out-of-sample extension enables real-time analysis of web server traffic. The framework is demonstrated using …

ta113; Web server; Computer Networks and Communications; business.industry; Computer science; Random projection; Dimensionality reduction; Random projection; Principal component analysis; Intrusion detection system; Anomaly detection; Machine learning; computer.software_genre; Cyber security; Web traffic; Principal component analysis; Diffusion map; Anomaly detection; Intrusion detection; Artificial intelligence; Data mining; Web service; business; kyberturvallisuus; computer
researchProduct

An Efficient Network Log Anomaly Detection System Using Random Projection Dimensionality Reduction

2014

Network traffic is increasing all the time and network services are becoming more complex and vulnerable. To protect these networks, intrusion detection systems are used. Signature-based intrusion detection cannot find previously unknown attacks, which is why anomaly detection is needed. However, many new systems are slow and complicated. We propose a log anomaly detection framework which aims to facilitate quick anomaly detection and also provide visualizations of the network traffic structure. The system preprocesses network logs into a numerical data matrix, reduces the dimensionality of this matrix using random projection and uses Mahalanobis distance to find outliers and calculate an a…

ta113; random projection; Mahalanobis distance; Computer science; business.industry; Anomaly-based intrusion detection system; intrusion detection; Dimensionality reduction; Random projection; Pattern recognition; Intrusion detection system; computer.software_genre; koneoppiminen; Anomaly detection; Data mining; Artificial intelligence; tiedonlouhinta; Anomaly (physics); mahalanobis distance; business; computer; Curse of dimensionality; 2014 6th International Conference on New Technologies, Mobility and Security (NTMS)
researchProduct
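
The pipeline outlined in this abstract (numerical data matrix, random projection, Mahalanobis distance for outlier scoring) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: the log preprocessing step is replaced by synthetic Gaussian rows, the projection matrix is Gaussian, the mean and covariance are fitted on normal traffic only, and all names and sizes are hypothetical.

```python
import math
import random

def project(X, R):
    """Multiply each row of X (n x d) by the projection matrix R (d x k)."""
    k = len(R[0])
    return [[sum(x[i] * R[i][j] for i in range(len(x))) for j in range(k)] for x in X]

def mean_and_cov(Y):
    """Column means and sample covariance matrix of the rows of Y."""
    n, k = len(Y), len(Y[0])
    mu = [sum(row[j] for row in Y) / n for j in range(k)]
    cov = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in Y) / (n - 1)
            for b in range(k)] for a in range(k)]
    return mu, cov

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z

def mahalanobis(y, mu, cov):
    """Mahalanobis distance sqrt((y - mu)^T cov^{-1} (y - mu))."""
    diff = [yi - mi for yi, mi in zip(y, mu)]
    return math.sqrt(sum(di * zi for di, zi in zip(diff, solve(cov, diff))))

rng = random.Random(0)
d, k = 30, 5                                  # hypothetical feature / reduced dims
R = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)] for _ in range(d)]

# Stand-in for the preprocessed log matrix: 200 rows of "normal" traffic.
normal = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(200)]
# One synthetic anomalous row with a much larger spread.
anomaly = [rng.gauss(0.0, 25.0) for _ in range(d)]

mu, cov = mean_and_cov(project(normal, R))
normal_scores = [mahalanobis(y, mu, cov) for y in project(normal, R)]
anomaly_score = mahalanobis(project([anomaly], R)[0], mu, cov)
print(f"max normal score: {max(normal_scores):.2f}, anomaly score: {anomaly_score:.2f}")
```

Working in the reduced space keeps the covariance matrix small (5 x 5 here instead of 30 x 30), which is the main reason to project before computing Mahalanobis distances.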